VReplication: Ensure that RowStreamer uses optimal index when possible #13893

Merged
merged 9 commits into from
Oct 1, 2023

Conversation

@mattlord mattlord commented Aug 30, 2023

Description

In this PR we add a FORCE INDEX clause to the low-level rowstreamer source query whenever the PK columns (or PK-equivalent columns) are covered by an existing index and all of those columns are used in the ORDER BY clause. The forced index is the one that covers those columns, so it is not always PRIMARY. The source query is what we run at the mysqld layer to get rows for streaming. The vreplication filters are applied afterwards, in the vstreamer layer on the source tablet, and the matching rows are then sent on to the vcopier on the target tablet. Using an index for the ORDER BY clause at the mysqld layer is what allows us to start streaming rows immediately. Without it, mysqld has to read every matching row and sort them all on disk with a filesort before the vstreamer gets any rows back from mysqld/rowstreamer.
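
To make the shape of the resulting source query concrete, here is a minimal sketch in Go. It is not the code from this PR: the `index` type and the `pickOrderingIndex` and `buildSourceQuery` helpers are hypothetical names used only for illustration, and the matching rule (the index's leading columns equal the ORDER BY columns, in order) is a simplifying assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// index is a hypothetical, simplified description of a table index.
type index struct {
	name    string
	columns []string
}

// pickOrderingIndex returns the name of the first index whose leading
// columns match orderCols, in order, or "" if no index qualifies.
// Assumption: this is a simplified stand-in for the real selection logic.
func pickOrderingIndex(indexes []index, orderCols []string) string {
	for _, idx := range indexes {
		if len(idx.columns) < len(orderCols) {
			continue
		}
		match := true
		for i, col := range orderCols {
			if !strings.EqualFold(idx.columns[i], col) {
				match = false
				break
			}
		}
		if match {
			return idx.name
		}
	}
	return ""
}

// buildSourceQuery assembles a SELECT with an optional FORCE INDEX hint
// and an ORDER BY on the given columns.
func buildSourceQuery(table string, selectCols, orderCols []string, indexName string) string {
	var b strings.Builder
	fmt.Fprintf(&b, "select %s from `%s`", strings.Join(selectCols, ", "), table)
	if indexName != "" {
		fmt.Fprintf(&b, " force index (`%s`)", indexName)
	}
	if len(orderCols) > 0 {
		fmt.Fprintf(&b, " order by %s", strings.Join(orderCols, ", "))
	}
	return b.String()
}

func main() {
	// A table with no PRIMARY KEY but a unique index on PK-equivalent columns.
	indexes := []index{
		{name: "name_idx", columns: []string{"name"}},
		{name: "uniq_id_region", columns: []string{"id", "region"}},
	}
	orderCols := []string{"id", "region"}
	idx := pickOrderingIndex(indexes, orderCols)
	fmt.Println(buildSourceQuery("customer", []string{"id", "region", "name"}, orderCols, idx))
}
```

With the hint in place, mysqld can return rows in index order as it reads them, rather than materializing and sorting the full result first. When no index qualifies, the sketch emits the query without a hint and leaves the choice to the optimizer.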

Note
✅ My current plan is to hold off on merging this until after the upcoming 18.0-RC cutoff (scheduled for Sept 29th) so that it has the full v19 dev cycle to bake and surface any unforeseen edge cases.

Related Issue(s)

Checklist

  • "Backport to:" labels have been added if this change should be back-ported
  • Tests were added or are not required
  • Did the new or modified tests pass consistently locally and on the CI?
  • Documentation was added or is not required

vitess-bot bot commented Aug 30, 2023

Review Checklist

Hello reviewers! 👋 Please follow this checklist when reviewing this Pull Request.

General

  • Ensure that the Pull Request has a descriptive title.
  • Ensure there is a link to an issue (except for internal cleanup and flaky test fixes). New features should have an RFC that documents use cases and test cases.

Tests

  • Bug fixes should have at least one unit or end-to-end test. Enhancements and new features should have a sufficient number of tests.

Documentation

  • Apply the release notes (needs details) label if users need to know about this change.
  • New features should be documented.
  • There should be some code comments as to why things are implemented the way they are.
  • There should be a comment at the top of each new or modified test to explain what the test does.

New flags

  • Is this flag really necessary?
  • Flag names must be clear and intuitive, use dashes (-), and have a clear help text.

If a workflow is added or modified:

  • Each item in Jobs should be named in order to mark it as required.
  • If the workflow needs to be marked as required, the maintainer team must be notified.

Backward compatibility

  • Protobuf changes should be wire-compatible.
  • Changes to _vt tables and RPCs need to be backward compatible.
  • RPC changes should be compatible with vitess-operator
  • If a flag is removed, then it should also be removed from vitess-operator and arewefastyet, if used there.
  • vtctl command output order should be stable and awk-able.

@vitess-bot vitess-bot bot added the NeedsDescriptionUpdate, NeedsIssue, and NeedsWebsiteDocsUpdate labels Aug 30, 2023
@github-actions github-actions bot added this to the v18.0.0 milestone Aug 30, 2023
This is the index that should contain the columns specified in the
ORDER BY clause and allows us to leverage the record ordering in
the index and avoid having to do a filesort.

Signed-off-by: Matt Lord <[email protected]>
@mattlord mattlord force-pushed the rowstreamer_force_index branch from 1211aff to 7e10b5d Compare August 30, 2023 04:37
Signed-off-by: Matt Lord <[email protected]>
@mattlord mattlord added the Type: Enhancement and Component: VReplication labels and removed the NeedsDescriptionUpdate and NeedsWebsiteDocsUpdate labels Aug 31, 2023
@mattlord mattlord removed this from the v18.0.0 milestone Sep 6, 2023
@mattlord mattlord removed the NeedsIssue label Sep 6, 2023

@rohit-nayak-ps rohit-nayak-ps left a comment

lgtm

@mattlord mattlord removed the request for review from systay September 29, 2023 17:16
@mattlord mattlord removed the request for review from harshit-gangal September 29, 2023 17:16
@mattlord mattlord added this to the v19.0.0 milestone Sep 29, 2023

@shlomi-noach shlomi-noach left a comment

Makes perfect sense, with one minor comment.

go/vt/vttablet/tabletserver/vstreamer/rowstreamer.go (outdated, resolved)
@mattlord mattlord merged commit d5efe8e into vitessio:main Oct 1, 2023
@mattlord mattlord deleted the rowstreamer_force_index branch October 1, 2023 21:21
Labels
Component: VReplication, Type: Enhancement
Projects
Status: Done
3 participants